refactor: Major overhaul to audio clip retention methods, age and usa… #182

Merged
3 commits merged on May 31, 2024

Conversation

tphakala
Owner

…ge based policy options.

Retention settings have changed

New settings

  • Policy (none, age, usage)
  • MaxUsage (disk usage limit as a percentage, default 80%)
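
Since MaxUsage is stored as a string such as "80%", it has to be converted to a number before it can be compared with the measured disk usage. The following is a minimal sketch of that conversion; parsePercent is a hypothetical helper written for illustration and is not the project's conf.ParsePercentage, and the 0-100 range check is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parsePercent is a hypothetical helper (not the project's conf.ParsePercentage).
// It converts "80%" or "80" to 80.0 and rejects values outside 0-100.
func parsePercent(s string) (float64, error) {
	trimmed := strings.TrimSuffix(strings.TrimSpace(s), "%")
	v, err := strconv.ParseFloat(trimmed, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid percentage %q: %w", s, err)
	}
	// Written as a negated range check so that NaN is rejected as well.
	if !(v >= 0 && v <= 100) {
		return 0, fmt.Errorf("percentage %q is outside the range 0-100", s)
	}
	return v, nil
}

func main() {
	for _, in := range []string{"80%", "120%", "abc"} {
		v, err := parsePercent(in)
		fmt.Println(in, v, err)
	}
}
```

Rejecting out-of-range values at load time keeps a typo in the config from silently disabling or permanently triggering cleanup.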

Renamed settings

  • MinEvictionHours to MaxAge
  • minClipsPerSpecies to MinClips

MaxAge accepts retention periods such as:

  • 24h for 24 hours
  • 1w for 1 week
  • 3m for 3 months
  • 1y for 1 year
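
A minimal sketch of how such period strings could be turned into hours is shown here. parseRetentionHours is a hypothetical name and not the project's ParseRetentionPeriod; the month length (30 days), the year length (365 days), and the "d" suffix (suggested only by the "30d" default quoted later in this review) are assumptions.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// hoursPerUnit maps a one-letter suffix to a number of hours.
// Assumptions: a month is 30 days, a year is 365 days, "d" means days.
var hoursPerUnit = map[byte]int{
	'h': 1,
	'd': 24,
	'w': 24 * 7,
	'm': 24 * 30,
	'y': 24 * 365,
}

// parseRetentionHours is a hypothetical helper that converts strings
// like "24h", "1w", "3m" or "1y" into a whole number of hours.
func parseRetentionHours(s string) (int, error) {
	s = strings.ToLower(strings.TrimSpace(s))
	if len(s) < 2 {
		return 0, fmt.Errorf("retention period %q is too short", s)
	}
	unit, ok := hoursPerUnit[s[len(s)-1]]
	if !ok {
		return 0, fmt.Errorf("retention period %q has an unknown unit", s)
	}
	n, err := strconv.Atoi(s[:len(s)-1])
	if err != nil {
		return 0, fmt.Errorf("retention period %q has an invalid number: %w", s, err)
	}
	if n < 0 {
		return 0, fmt.Errorf("retention period %q must not be negative", s)
	}
	return n * unit, nil
}

func main() {
	for _, in := range []string{"24h", "1w", "3m", "1y", "10x"} {
		hours, err := parseRetentionHours(in)
		fmt.Println(in, hours, err)
	}
}
```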

Retention struct {
     Debug    bool   // true to enable retention debug
     Policy   string // retention policy: "none", "age" or "usage"
     MaxAge   string // maximum age of audio clips to keep
     MaxUsage string // maximum disk usage percentage before cleanup
     MinClips int    // minimum number of clips per species to keep
}

Contributor

coderabbitai bot commented May 31, 2024

Walkthrough

The update transitions the audio clip retention system from a mode-based to a policy-based approach. Key changes include restructuring configuration parameters, adding new cleanup policies, and introducing utility functions for retention period parsing and file management. These modifications enhance flexibility in managing audio clip retention based on age, usage, and other criteria.

Changes

  • internal/analysis/realtime.go: Updated clip cleanup logic to use specific retention policies instead of modes.
  • internal/conf/config.go, internal/conf/config.yaml, internal/conf/defaults.go: Restructured retention settings, replacing mode and Enabled with policy, maxage, maxusage, and minclips.
  • internal/conf/utils.go: Added ParseRetentionPeriod function to convert retention period strings into hours.
  • internal/diskmanager/file_utils.go: Introduced shared file management functions and added fields to FileInfo struct.
  • internal/diskmanager/policy_age.go: Implemented age-based cleanup policy for removing clips.
  • internal/diskmanager/policy_usage.go: Introduced disk usage-based cleanup policy and supporting functions.

Sequence Diagram(s) (Beta)

sequenceDiagram
    participant User
    participant RealtimeAnalysis
    participant Config
    participant DiskManager

    User->>RealtimeAnalysis: Initiate Analysis
    RealtimeAnalysis->>Config: Fetch Retention Policy
    Config-->>RealtimeAnalysis: Return Policy Details
    RealtimeAnalysis->>DiskManager: Trigger Cleanup Based on Policy
    alt Policy is "age"
        DiskManager->>DiskManager: Perform Age-Based Cleanup
    else Policy is "usage"
        DiskManager->>DiskManager: Perform Usage-Based Cleanup
    end
    DiskManager-->>RealtimeAnalysis: Cleanup Complete
    RealtimeAnalysis-->>User: Analysis Complete
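
The policy branch in the diagram can be sketched as a single dispatch function. This is an illustration under stated assumptions rather than the code in realtime.go: runRetention and cleanupFunc are hypothetical names, the cleanup routines are passed in as plain functions, and returning an error for an unrecognized policy is a design choice made here.

```go
package main

import (
	"errors"
	"fmt"
)

// cleanupFunc stands in for a cleanup routine such as an age-based
// or usage-based cleanup; the type is hypothetical.
type cleanupFunc func() error

// runRetention picks a cleanup routine by policy name.
// "none" does nothing; unknown names are reported as errors.
func runRetention(policy string, age, usage cleanupFunc) error {
	switch policy {
	case "none":
		return nil
	case "age":
		return age()
	case "usage":
		return usage()
	default:
		return fmt.Errorf("unknown retention policy %q", policy)
	}
}

func main() {
	age := func() error { fmt.Println("age-based cleanup"); return nil }
	usage := func() error { return errors.New("disk check failed") }
	for _, p := range []string{"none", "age", "usage", "use"} {
		if err := runRetention(p, age, usage); err != nil {
			fmt.Printf("policy %q: %v\n", p, err)
		}
	}
}
```

Routing every policy through one switch means a misspelled value surfaces as an error instead of silently skipping cleanup.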

Poem

In the realm of code so bright,
Clips now fade with policy's might,
Age and usage take their turn,
As files in disk do twist and churn.
Debugging paths now clear and true,
A rabbit's joy in changes new! 🐇


@tphakala tphakala marked this pull request as ready for review May 31, 2024 14:25
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

Outside diff range and nitpick comments (1)
internal/conf/config.go (1)

Line range hint 183-183: Ensure that errors are properly handled when loading settings.

- _, err := Load()
+ _, err := Load()
+ if err != nil {
+     log.Fatalf("Error loading settings: %v", err)
+ }
Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between eda3be4 and d7ae595.

Files selected for processing (8)
  • internal/analysis/realtime.go (2 hunks)
  • internal/conf/config.go (1 hunks)
  • internal/conf/config.yaml (1 hunks)
  • internal/conf/defaults.go (1 hunks)
  • internal/conf/utils.go (1 hunks)
  • internal/diskmanager/file_utils.go (1 hunks)
  • internal/diskmanager/policy_age.go (1 hunks)
  • internal/diskmanager/policy_usage.go (1 hunks)
Additional context used
GitHub Check: lint
internal/conf/config.go

[failure] 183-183:
Error return value is not checked (errcheck)

Additional comments not posted (4)
internal/conf/config.yaml (1)

39-43: The configuration settings for retention policies are correctly updated to reflect the new system. However, ensure that the policy key's default value is consistent across all configuration files.

internal/conf/utils.go (1)

195-233: The ParseRetentionPeriod function is a crucial addition for handling various time formats in retention settings. Ensure that the error messages are clear and provide enough context to help in debugging.

internal/analysis/realtime.go (1)

115-115: LGTM! Ensure all references to the old retention check are updated.

internal/conf/config.go (1)

57-61: LGTM! The updated Settings struct aligns well with the new retention policies.

Comment on lines +219 to +227
if conf.Setting().Realtime.Audio.Export.Retention.Policy == "age" {
if err := diskmanager.AgeBasedCleanup(quitChan); err != nil {
log.Println("Error cleaning up clips: ", err)
}
}

// priority based cleanup method
if conf.Setting().Realtime.Audio.Export.Retention.Mode == "priority" {
if conf.Setting().Realtime.Audio.Export.Debug {
log.Println("Running priority based cleanup")
}
if err := diskmanager.PriorityBasedCleanup(quitChan); err != nil {
if conf.Setting().Realtime.Audio.Export.Retention.Policy == "usage" {
if err := diskmanager.UsageBasedCleanup(quitChan); err != nil {
Contributor

LGTM! Consider adding more robust error handling for the cleanup functions.

- if err := diskmanager.AgeBasedCleanup(quitChan); err != nil {
-     log.Println("Error cleaning up clips: ", err)
- }
+ if err := diskmanager.AgeBasedCleanup(quitChan); err != nil {
+     log.Printf("Error cleaning up clips due to age policy: %v", err)
+ }
- if err := diskmanager.UsageBasedCleanup(quitChan); err != nil {
-     log.Println("Error cleaning up clips: ", err)
- }
+ if err := diskmanager.UsageBasedCleanup(quitChan); err != nil {
+     log.Printf("Error cleaning up clips due to usage policy: %v", err)
+ }
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
if conf.Setting().Realtime.Audio.Export.Retention.Policy == "age" {
if err := diskmanager.AgeBasedCleanup(quitChan); err != nil {
log.Printf("Error cleaning up clips due to age policy: %v", err)
}
}
// priority based cleanup method
if conf.Setting().Realtime.Audio.Export.Retention.Policy == "usage" {
if err := diskmanager.UsageBasedCleanup(quitChan); err != nil {
log.Printf("Error cleaning up clips due to usage policy: %v", err)

Comment on lines +13 to +103
// AgeBasedCleanup removes clips from the filesystem based on their age and the number of clips per species.
func AgeBasedCleanup(quit <-chan struct{}) error {
settings := conf.Setting()

debug := settings.Realtime.Audio.Export.Retention.Debug
baseDir := settings.Realtime.Audio.Export.Path
minClipsPerSpecies := settings.Realtime.Audio.Export.Retention.MinClips
retentionPeriod := settings.Realtime.Audio.Export.Retention.MaxAge

retentionPeriodInHours, err := conf.ParseRetentionPeriod(retentionPeriod)
if err != nil {
log.Printf("Invalid retention period: %s\n", err)
return err
}

allowedExts := []string{".wav"}

if debug {
log.Printf("Starting age-based cleanup process. Base directory: %s, Retention period: %s", baseDir, retentionPeriod)
}

files, err := GetAudioFiles(baseDir, allowedExts, debug)
if err != nil {
return err
}

// Create a map to keep track of the number of files per species per subdirectory
speciesMonthCount := make(map[string]map[string]int)
for _, file := range files {
subDir := filepath.Dir(file.Path)
if _, exists := speciesMonthCount[file.Species]; !exists {
speciesMonthCount[file.Species] = make(map[string]int)
}
speciesMonthCount[file.Species][subDir]++
}

expirationTime := time.Now().Add(-time.Duration(retentionPeriodInHours) * time.Hour)

maxDeletions := 1000 // Maximum number of files to delete in one run
deletedFiles := 0 // Counter for the number of deleted files

for _, file := range files {
select {
case <-quit:
log.Printf("Cleanup interrupted by quit signal\n")
return nil
default:
if file.Timestamp.Before(expirationTime) {
subDir := filepath.Dir(file.Path)

if speciesMonthCount[file.Species][subDir] <= minClipsPerSpecies {
if debug {
log.Printf("Species clip count for %s in %s is at the minimum threshold (%d). Skipping file deletion.", file.Species, subDir, minClipsPerSpecies)
}
continue
}

if debug {
log.Printf("File %s is older than retention period, deleting.", file.Path)
}

err = os.Remove(file.Path)
if err != nil {
log.Printf("Failed to remove %s: %s\n", file.Path, err)
return err
}

speciesMonthCount[file.Species][subDir]--
deletedFiles++

if debug {
log.Printf("File %s deleted", file.Path)
}

// Check if we have reached the maximum number of deletions
if deletedFiles >= maxDeletions {
if debug {
log.Printf("Reached maximum number of deletions (%d). Ending cleanup.", maxDeletions)
}
return nil
}
}
}
}

if debug {
log.Printf("Age retention policy applied, total files deleted: %d", deletedFiles)
}

return nil
}
Contributor

The implementation of AgeBasedCleanup appears robust and well-structured. However, consider adding more detailed logging for each step of the file deletion process, especially before and after deletion, to improve traceability and debugging.

+ log.Printf("Preparing to delete file: %s", file.Path)
  err = os.Remove(file.Path)
+ log.Printf("File deleted: %s", file.Path)
Committable suggestion

Suggested change
if debug {
log.Printf("File %s is older than retention period, deleting.", file.Path)
}
+ if debug {
+ log.Printf("Preparing to delete file: %s", file.Path)
+ }
err = os.Remove(file.Path)
+ if debug {
+ log.Printf("File deleted: %s", file.Path)
+ }
if err != nil {
log.Printf("Failed to remove %s: %s\n", file.Path, err)
return err
}

Comment on lines 43 to 46
viper.SetDefault("realtime.audio.export.retention.policy", "use")
viper.SetDefault("realtime.audio.export.retention.maxusage", "80%")
viper.SetDefault("realtime.audio.export.retention.maxage", "30d")
viper.SetDefault("realtime.audio.export.retention.minclips", 10)
Contributor

Ensure that the default value for the policy key is correctly set to one of the valid options ("none", "age", "usage"). Currently, it is mistakenly set to "use", which could lead to configuration errors.

- viper.SetDefault("realtime.audio.export.retention.policy", "use")
+ viper.SetDefault("realtime.audio.export.retention.policy", "usage")
Committable suggestion

Suggested change
viper.SetDefault("realtime.audio.export.retention.policy", "usage")
viper.SetDefault("realtime.audio.export.retention.maxusage", "80%")
viper.SetDefault("realtime.audio.export.retention.maxage", "30d")
viper.SetDefault("realtime.audio.export.retention.minclips", 10)

Comment on lines +25 to +56
// LoadPolicy loads the cleanup policies from a CSV file
func LoadPolicy(policyFile string) (*Policy, error) {
file, err := os.Open(policyFile)
if err != nil {
return nil, err
}
defer file.Close()

reader := csv.NewReader(file)
records, err := reader.ReadAll()
if err != nil {
return nil, err
}

policy := &Policy{
AlwaysCleanupFirst: make(map[string]bool),
NeverCleanup: make(map[string]bool),
}

for _, record := range records {
if len(record) != 2 {
return nil, errors.New("invalid policy record")
}
if record[1] == "always" {
policy.AlwaysCleanupFirst[record[0]] = true
} else if record[1] == "never" {
policy.NeverCleanup[record[0]] = true
}
}

return policy, nil
}
Contributor

The LoadPolicy function correctly handles CSV file reading and policy parsing. However, consider adding more specific error messages for different failure scenarios to enhance maintainability and debugging.

- return nil, errors.New("invalid policy record")
+ return nil, fmt.Errorf("invalid policy record: expected 2 columns, got %d", len(record))
Committable suggestion

Suggested change
// LoadPolicy loads the cleanup policies from a CSV file
func LoadPolicy(policyFile string) (*Policy, error) {
file, err := os.Open(policyFile)
if err != nil {
return nil, err
}
defer file.Close()
reader := csv.NewReader(file)
records, err := reader.ReadAll()
if err != nil {
return nil, err
}
policy := &Policy{
AlwaysCleanupFirst: make(map[string]bool),
NeverCleanup: make(map[string]bool),
}
for _, record := range records {
if len(record) != 2 {
return nil, fmt.Errorf("invalid policy record: expected 2 columns, got %d", len(record))
}
if record[1] == "always" {
policy.AlwaysCleanupFirst[record[0]] = true
} else if record[1] == "never" {
policy.NeverCleanup[record[0]] = true
}
}
return policy, nil
}

Comment on lines +19 to +143
}

if diskUsage > threshold {
if debug {
log.Printf("Disk usage %.1f%% is above the %.1f%% threshold. Cleanup needed.", diskUsage, threshold)
}

// Get the list of audio files
files, err := GetAudioFiles(baseDir, allowedExts, debug)
if err != nil {
return err
}

// Sort files by the cleanup priority and get the initial count of files per species per subdirectory
speciesMonthCount := sortFiles(files, debug)

// Debug: write sorted files to a file
if debug {
if err := WriteSortedFilesToFile(files, "file_cleanup_order.txt"); err != nil {
return err
}
}

// Perform the cleanup
return performCleanup(files, baseDir, threshold, minClipsPerSpecies, speciesMonthCount, debug, quitChan)
} else {
if debug {
log.Printf("Disk usage %.1f%% is below the %.1f%% threshold. No cleanup needed.", diskUsage, threshold)
}
}

return nil
}

func performCleanup(files []FileInfo, baseDir string, threshold float64, minClipsPerSpecies int, speciesMonthCount map[string]map[string]int, debug bool, quitChan chan struct{}) error {
// Delete files until disk usage is below the threshold or maxDeletions files have been deleted
deletedFiles := 0
maxDeletions := 1000
totalFreedSpace := int64(0)

for _, file := range files {
select {
case <-quitChan:
log.Println("Received quit signal, ending cleanup run.")
return nil
default:
// Get the subdirectory name
subDir := filepath.Dir(file.Path)
month := file.Timestamp.Format("2006-01")

diskUsage, err := GetDiskUsage(baseDir)
if err != nil {
return err
}

// Check if disk usage is below threshold or max deletions reached
if diskUsage < threshold || deletedFiles >= maxDeletions {
// all done for now, exit select loop
break
}

if debug {
log.Printf("Species %s has %d clips in %s", file.Species, speciesMonthCount[file.Species][subDir], subDir)
}

if speciesMonthCount[file.Species][subDir] <= minClipsPerSpecies {
if debug {
log.Printf("Species clip count for %s in %s/%s is below the minimum threshold (%d). Skipping file deletion.", file.Species, month, subDir, minClipsPerSpecies)
}
continue
}
if debug {
log.Printf("Deleting file: %s", file.Path)
}

// Delete the file deemed for cleanup
err = os.Remove(file.Path)
if err != nil {
return err
}

// Increment deleted files count and update species count
deletedFiles++
speciesMonthCount[file.Species][subDir]--

// Add file size to total freed space
totalFreedSpace += file.Size

if debug {
log.Printf("File deleted. %d clips left for species %s in %s", speciesMonthCount[file.Species][subDir], file.Species, subDir)
}
}
}

if debug {
log.Printf("Usage retention policy applied, total files deleted: %d", deletedFiles)
}

return nil
}
Contributor

The UsageBasedCleanup and performCleanup functions are well-structured and adhere to the new retention policy. Consider logging the total freed space after cleanup to provide more detailed feedback on the cleanup process.

+ log.Printf("Total freed space: %d bytes", totalFreedSpace)
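
One detail in the quoted loop is worth checking: in Go, an unlabeled break inside a select ends only the select statement, not the enclosing for loop. As quoted, the loop would therefore keep visiting the remaining files and re-checking disk usage after the threshold or deletion cap is reached, even though nothing more is deleted while that condition holds. The following is a minimal sketch of the same loop shape with a labeled break; deleteUntil is a hypothetical function that simulates deletions on a slice of file sizes instead of touching the disk.

```go
package main

import "fmt"

// deleteUntil simulates the cleanup loop. It "deletes" entries until the
// freed byte count reaches target, maxDeletions is hit, or quit is closed.
// It returns the number of deletions and the total bytes freed.
func deleteUntil(sizes []int64, target int64, maxDeletions int, quit <-chan struct{}) (deleted int, freed int64) {
cleanup:
	for _, size := range sizes {
		select {
		case <-quit:
			break cleanup
		default:
			if freed >= target || deleted >= maxDeletions {
				// A bare break would leave only the select;
				// the label ends the for loop as well.
				break cleanup
			}
			deleted++
			freed += size
		}
	}
	return deleted, freed
}

func main() {
	quit := make(chan struct{})
	deleted, freed := deleteUntil([]int64{100, 200, 300, 400}, 250, 1000, quit)
	fmt.Printf("deleted %d files, freed %d bytes\n", deleted, freed)
}
```

Returning the freed byte count from the loop also gives the caller a single value to log, in line with the suggestion above.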
Committable suggestion

Suggested change
// UsageBasedCleanup cleans up old audio files based on the configuration and monitors for quit signals
func UsageBasedCleanup(quitChan chan struct{}) error {
settings := conf.Setting()
debug := settings.Realtime.Audio.Export.Retention.Debug
baseDir := settings.Realtime.Audio.Export.Path
minClipsPerSpecies := settings.Realtime.Audio.Export.Retention.MinClips
// Convert 80% string etc. to 80.0 float64
threshold, err := conf.ParsePercentage(settings.Realtime.Audio.Export.Retention.MaxUsage)
if err != nil {
return err
}
// Only remove files with extensions in this list
allowedExts := []string{".wav"}
if debug {
log.Printf("Starting cleanup process. Base directory: %s, Threshold: %.1f%%", baseDir, threshold)
}
// Check current disk usage
diskUsage, err := GetDiskUsage(baseDir)
if err != nil {
return err
}
if diskUsage > threshold {
if debug {
log.Printf("Disk usage %.1f%% is above the %.1f%% threshold. Cleanup needed.", diskUsage, threshold)
}
// Get the list of audio files
files, err := GetAudioFiles(baseDir, allowedExts, debug)
if err != nil {
return err
}
// Sort files by the cleanup priority and get the initial count of files per species per subdirectory
speciesMonthCount := sortFiles(files, debug)
// Debug: write sorted files to a file
if debug {
if err := WriteSortedFilesToFile(files, "file_cleanup_order.txt"); err != nil {
return err
}
}
// Perform the cleanup
return performCleanup(files, baseDir, threshold, minClipsPerSpecies, speciesMonthCount, debug, quitChan)
} else {
if debug {
log.Printf("Disk usage %.1f%% is below the %.1f%% threshold. No cleanup needed.", diskUsage, threshold)
}
}
return nil
}
func performCleanup(files []FileInfo, baseDir string, threshold float64, minClipsPerSpecies int, speciesMonthCount map[string]map[string]int, debug bool, quitChan chan struct{}) error {
// Delete files until disk usage is below the threshold or maxDeletions files have been deleted
deletedFiles := 0
maxDeletions := 1000
totalFreedSpace := int64(0)
for _, file := range files {
select {
case <-quitChan:
log.Println("Received quit signal, ending cleanup run.")
return nil
default:
// Get the subdirectory name
subDir := filepath.Dir(file.Path)
month := file.Timestamp.Format("2006-01")
diskUsage, err := GetDiskUsage(baseDir)
if err != nil {
return err
}
// Check if disk usage is below threshold or max deletions reached
if diskUsage < threshold || deletedFiles >= maxDeletions {
// all done for now, exit select loop
break
}
if debug {
log.Printf("Species %s has %d clips in %s", file.Species, speciesMonthCount[file.Species][subDir], subDir)
}
if speciesMonthCount[file.Species][subDir] <= minClipsPerSpecies {
if debug {
log.Printf("Species clip count for %s in %s/%s is below the minimum threshold (%d). Skipping file deletion.", file.Species, month, subDir, minClipsPerSpecies)
}
continue
}
if debug {
log.Printf("Deleting file: %s", file.Path)
}
// Delete the file deemed for cleanup
err = os.Remove(file.Path)
if err != nil {
return err
}
// Increment deleted files count and update species count
deletedFiles++
speciesMonthCount[file.Species][subDir]--
// Add file size to total freed space
totalFreedSpace += file.Size
if debug {
log.Printf("File deleted. %d clips left for species %s in %s", speciesMonthCount[file.Species][subDir], file.Species, subDir)
}
}
}
if debug {
log.Printf("Usage retention policy applied, total files deleted: %d", deletedFiles)
log.Printf("Total freed space: %d bytes", totalFreedSpace)
}
return nil
}

@tphakala tphakala merged commit edbea2a into main May 31, 2024
8 of 9 checks passed
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between d7ae595 and 5bfa08d.

Files selected for processing (1)
  • internal/conf/defaults.go (1 hunks)
Files skipped from review as they are similar to previous changes (1)
  • internal/conf/defaults.go

@tphakala tphakala deleted the cleanup-overhaul branch October 20, 2024 11:16